Using R in hydrology (EGU2017 short course)

These slides and all other course materials can be found at:

github.com/brry/rhydro #slides

To get all the material including the datasets and presentation source code, we recommend to download the whole github course repository.

This is an R Markdown Notebook.
For discussions, please visit the Hydrology in R Facebook group.
Before running the code blocks below, we suggest to get package installation instructions by running:

source("https://raw.githubusercontent.com/brry/rhydro/master/checkpc.R")


Aim and contents of this workshop

We want to:

  • Show off how awesome R is for hydrology (it’s R-some!^^)
  • Convince you to start or continue using R
  • Provide all the code for you as a starting point

We can not:

  • Teach you actual R coding (90 mins is too short for a tutorial)

We have prepared:

 

Before we get started, please let us know your current R knowledge level by filling out the short survey at
bit.ly/knowR


top

Report

Good coding practice, report generation (Rstudio, rmarkdown, R notebook)
Daniel Klotz

Introduction

goals

goals

Why I use R

Why I did not use:

equals

equals

Whats great about R:

  library(ggplot2)
  test_data <- mpg
  test_plot <- ggplot(test_data, aes(displ, hwy, colour = class)) + 
    geom_point()
  test_plot

Why I decided to use R:

pipe

pipe

Previously:

  aggregation_function <- function(x) {
    round(mean(x),2)
  }
  mtcars_subset <- subset(mtcars,hp > 100)
  mtcars_aggregated <- aggregate(. ~ cyl, data = mtcars_subset, FUN = aggregation_function)
  car_data1 <- transform(mtcars_aggregated, kpl = mpg*0.4251)
  print(car_data1)

Now:

library(magrittr)
car_data2 <- 
  mtcars %>%
  subset(hp > 100) %>%
  aggregate(. ~ cyl, data = ., FUN = . %>% mean %>% round(2)) %>%
  transform(kpl = mpg %>% multiply_by(0.4251)) %>%
  print

btw: You can integrate other programming languages with ease. Here an example from Yihui Xie for the use of Fortran in Rmarkdown:

  1. Compile Code:

    C Fortran test
          subroutine fexp(n, x)
          double precision x
    C  output
          integer n, i
    C  input value
          do 10 i=1,n
             x=dexp(dcos(dsin(dble(float(i)))))
      10  continue
          return
          end
  2. Run Code:

    res = .Fortran("fexp", n=100000L, x=0)
    str(res)

Be happy with the result: > ## List of 2 > ## $ n: int 100000 > ## $ x: num 2.72


Markdown

HTML: HyperText Markdown Language

John Gruber:

John Gruber

Comparison: Markdown vs. Latex Comparison

Rstudio provides cheat-sheets with the most important informations about many of their “favorite” packages & software:

Cheat Sheet

Rmarkdown

In Rstudio:

Rmark1

Rmark2

Rmark2

“Native” Formats:

Much more possible if you adress pandoc directly: pandoc

Information in the text can be automatically updated with the rest of the document: [time for coffee

Examples

Small Websites

Books (blogdown)

bookdown1 bookdown2

Blogs (hugo)

hugo

Widgets

cran-gauge superzip

Presentations

bookdown1

Apps (Shiny)

shiny shiny2


top

GIS

Using R as GIS (reading a rainfall shapefile + Kriging, sf + leaflet + mapview + OSMscale)
Berry Boessenkool

Shapefiles

Reading shapefiles with maptools::readShapeSpatial and rgdal::readOGR is obsolete.
Instead, use sf::st_read. sf is on CRAN since oct 2016.
Main reaction when using sf: “Wow, that is fast!”
Download the shapefile or better: download the whole github course repository

rain <- sf::st_read("data/PrecBrandenburg/niederschlag.shp")
centroids <- sf::st_centroid(rain)
centroids <- sf::st_coordinates(centroids)

top

Plotting, maps

Static plot:

plot(rain[,1])

Static map:

prj <- sf::st_crs(rain)$proj4string
centroids <- as.data.frame(centroids)
cent_ll <- OSMscale::projectPoints(Y,X, data=centroids, to=OSMscale::pll(), from=prj)
map_static <- OSMscale::pointsMap(y,x, cent_ll, fx=0.08, type="maptoolkit-topo", zoom=6)

Interactive map:

library(leaflet)
cent_ll$info <- paste0(sample(letters,nrow(cent_ll),TRUE), ", ", round(cent_ll$x,2), 
                       ", ", round(cent_ll$y,2))
leaflet(cent_ll) %>% addTiles() %>% addCircleMarkers(lng=~x, lat=~y, popup=~info)

Interactive map of shapefile:

# devtools::install_github("environmentalinformatics-marburg/mapview", ref = "develop")
library(berryFunctions) # classify, seqPal
col <- seqPal(n=100, colors=c("red","yellow","blue"))[classify(rain$P1)$index]
mapview::mapview(rain, col.regions=col)

top

Kriging

Plot original points colored by third dimension:

pcol <- colorRampPalette(c("red","yellow","blue"))(50)
x <- centroids$X # use cent_ll$x for projected data
y <- centroids$Y
berryFunctions::colPoints(x, y, rain$P1, add=FALSE, col=pcol)

Calculate the variogram and fit a semivariance curve

library(geoR)
geoprec <- as.geodata(cbind(x,y,rain$P1))
vario <- variog(geoprec, max.dist=130000) # other maxdist for lat-lon data
fit <- variofit(vario)
plot(vario)
lines(fit)

Determine a useful resolution (keep in mind that computing time rises exponentially with grid size)

# distance to closest other point:
d <- sapply(1:length(x), function(i)
            min(berryFunctions::distance(x[i], y[i], x[-i], y[-i])) )
# for lat-long data use (2017-Apr only available in github version of OSMscale)
# d <- OSMscale::maxEarthDist(y,x, data=cent_ll, fun=min)
hist(d/1000, breaks=20, main="distance to closest gauge [km]")
mean(d/1000) # 8 km

Perform kriging on a grid with that resolution

res <- 1000 # 1 km, since stations are 8 km apart on average
grid <- expand.grid(seq(min(x),max(x),res),
                    seq(min(y),max(y),res))
krico <- krige.control(type.krige="OK", obj.model=fit)
#krobj <- krige.conv(geoprec, loc=grid, krige=krico)
#save(krobj, file="data/krobj.Rdata")
load("data/krobj.Rdata") # line above is too slow for recreation each time

Plot the interpolated values with or an equivalent (see Rclick 4.15) and add contour lines.

par(mar=c(0,3,0,3))
geoR:::image.kriging(krobj, col=pcol)
colPoints(x, y, rain$P1, col=pcol, legargs=list(horiz=F, title="Prec",y1=0.1,x1=0.9))
points(x,y)
plot(rain, col=NA, add=TRUE)


top

Discharge

River discharge time-series visualisation and extreme value statistics (animation + extremeStat)
Berry Boessenkool


top

Hydmod

Hydrological modelling with airGR
Katie Smith


top

EDA

Exploratory Data Analysis including flow duration curve and trend analysis on time-series
Shaun Harrigan


top

Discussion

Please give us feedback at bit.ly/feedbackR

For discussions, please use the Hydrology in R Facebook group.